Learning Concept Taxonomies from Multi-modal Data
Authors
Abstract
We study the problem of automatically building hypernym taxonomies from textual and visual data. Previous work on taxonomy induction generally ignores the increasingly prominent visual data, which encode important perceptual semantics. Instead, we propose a probabilistic model for taxonomy induction that jointly leverages text and images. To avoid hand-crafted feature engineering, we design end-to-end features based on distributed representations of images and words. The model is discriminatively trained on a small set of existing ontologies and can build full taxonomies from scratch for a collection of unseen conceptual label items with associated images. We evaluate our model and features on the WordNet hierarchies, where our system outperforms previous approaches by a large margin.
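The abstract does not spell out the model's internals. As a rough illustration only, the sketch below shows how a hypernym edge score could combine distributed word and image representations and feed a simple greedy parent-attachment step; all names, the feature design, and the greedy attachment are hypothetical stand-ins, not the authors' actual model.

```python
import numpy as np

def edge_score(word_vecs, image_vecs, child, parent, w):
    """Score the candidate hypernym edge child -> parent.

    Hypothetical feature: concatenate the word-embedding offset with the
    mean-image-embedding offset between parent and child concepts, then
    take a dot product with a (discriminatively trained) weight vector w.
    """
    feat = np.concatenate([
        word_vecs[parent] - word_vecs[child],
        image_vecs[parent].mean(axis=0) - image_vecs[child].mean(axis=0),
    ])
    return float(w @ feat)

def build_taxonomy(concepts, root, word_vecs, image_vecs, w):
    """Greedy sketch: attach each non-root concept to its best-scoring parent."""
    parents = {}
    for c in concepts:
        if c == root:
            continue
        candidates = [p for p in concepts if p != c]
        parents[c] = max(
            candidates,
            key=lambda p: edge_score(word_vecs, image_vecs, c, p, w),
        )
    return parents
```

A greedy per-node attachment like this ignores global tree constraints (it can create cycles); a full system would instead search over tree structures, but the sketch conveys how text and image embeddings can be fused into a single edge feature.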
Similar resources
Assessment of learning style based on VARK model among the students of Qom University of Medical Sciences
Introduction: Learning is a dominant phenomenon in human life. Learners differ from each other in attitudes and cognitive styles, which affect how people learn. In this connection, the VARK learning style model assesses students based on their individual abilities and preferred methods for obtaining information from the environment, along the dimensions of visual, aural, read/write, and kinesthetic. ...
High-order Deep Neural Networks for Learning Multi-Modal Representations
In multi-modal learning, data consists of multiple modalities, which need to be represented jointly to capture the real-world 'concept' that the data corresponds to (Srivastava & Salakhutdinov, 2012). However, it is not easy to obtain the joint representations reflecting the structure of multi-modal data with machine learning algorithms, especially with conventional neural networks. This is bec...
A Nonparametric Bayesian Model of Multi-Level Category Learning
Categories are often organized into hierarchical taxonomies, that is, tree structures where each node represents a labeled category, and a node’s parent and children are, respectively, the category’s supertype and subtypes. A natural question is whether it is possible to reconstruct category taxonomies in cases where we are not given explicit information about how categories are related to each...
Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning
Learning video concept detectors automatically from the big but noisy web data with no additional manual annotations is a novel but challenging area in the multimedia and the machine learning community. A considerable amount of videos on the web are associated with rich but noisy contextual information, such as the title, which provides weak annotations or labels about the video content. To lev...
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
We present a novel method for image-text multi-modal representation learning. To our knowledge, this work is the first to apply the adversarial learning concept to multi-modal learning without exploiting image-text pair information to learn multi-modal features. We use only category information, in contrast with most previous methods, which use image-text pair information for multi-modal embeddi...
Journal title: CoRR
Volume: abs/1606.09239  Issue: -
Pages: -
Publication year: 2016